Generation of an image that is devoid of a person from images that include the person

ABSTRACT

Certain embodiments disclosed herein generate an image that is devoid of a person. Such an embodiment can include using a camera to obtain a first image of a scene while a person is at a first location within the field of view (FOV) of the camera, obtaining a second image of the scene while the person is at a second location within the FOV of the camera, and generating, based on the first and second images, a third image of the scene, such that the third image of the scene is devoid of the person and includes portions of the scene that were blocked by the person in the first and second images. Other embodiments disclosed herein determine spatial information for one or more items of interest within a graphical representation of a region generated based on one or more images of the region captured using a camera of a mobile device.

PRIORITY CLAIM

This application is a Divisional of U.S. patent application Ser. No. 16/795,857, filed Feb. 20, 2020, which claims priority to U.S. Provisional Patent Application No. 62/810,470, filed Feb. 26, 2019. Priority is claimed to each of the above applications, and each of the above applications is incorporated herein by reference in its entirety.

BACKGROUND

It is often useful to have a schematic, blueprint or other graphical representation of rooms of a building when moving furniture, buying new furniture, buying carpets or rugs, remodeling, repainting or otherwise modifying or cataloguing characteristics of rooms or elements therein. Various products exist which are supposed to assist users in producing such graphical representations. Some such products, which are implemented using software, typically require that a user spend a large amount of time taking manual measurements of rooms and items therein and then manually entering such measurements into a computing device to enable the software running on the computing device to generate models based on the manually entered information. More recently, special cameras have been developed that remove some of the manual procedures previously necessary. For example, 360-degree cameras are available that have a field of view (FOV) that covers a full circle in the horizontal plane. Such a camera can be placed in a room (or other region) to obtain a 360-degree image of the room (or other region) from which a virtual tour of the room (or other region) can be generated. One potential problem of using a 360-degree camera is that the person (aka photographer) that is controlling the camera can inadvertently, or at least undesirably, be captured in the 360-degree image because, unlike when using a more conventional camera having a smaller FOV, the person cannot easily stand outside the FOV of the camera (e.g., by standing behind the camera). One way to overcome this problem is for the person (aka photographer) to place the 360-degree camera on a tripod and then stand in another room and use a remote control to trigger the 360-degree camera. However, such a solution is not optimal as it increases the time and complexity associated with obtaining the 360-degree images of rooms and/or other regions.

After a schematic, blueprint or other graphical representation of a room (or other region) has been generated, it would be useful if certain items of interest in the graphical representation could be tagged and spatial coordinates of such items of interest could be determined. For example, assume a schematic, blueprint or other graphical representation of rooms of a rental unit is being used to specify items of interest that need to be repaired before a tenant either moves into or out of the rental unit. It would be beneficial if the items that need to be repaired could be easily and readily tagged within the schematic, blueprint or other graphical representation in a manner that provides good specificity, e.g., to clearly indicate which one of numerous cabinet pulls in a kitchen needs to be repaired, or to clearly indicate which one of a number of door knobs in a bedroom needs to be repaired.

SUMMARY

Certain embodiments of the present invention can be used to essentially remove a person from an image, or more specifically, to generate an image that is devoid of a person. A method of such an embodiment includes using a camera to obtain a first image (A) of a scene within a FOV of the camera while a person is at a first location within the FOV of the camera, and thus, the person appears in a first portion of the first image (A). The method further includes obtaining a second image (B) of the scene within the FOV of the camera while the person is at a second location within the FOV of the camera that differs from the first location, and thus, the person appears in a second portion of the second image (B) that differs from the first portion of the first image (A). Additionally, the method includes generating, based on the first and second images (A and B), a third image (C) of the scene, such that the third image (C) of the scene is devoid of the person and includes portions of the scene that were blocked by the person in the first and second images (A and B), wherein the generating is performed using one or more processors. While such embodiments are especially useful with a 360-degree camera, such embodiments are also useful with cameras having other FOVs, such as a FOV that is 120 degrees or 180 degrees, but not limited thereto. In certain embodiments, the first and second images (A and B) are captured using a 360-degree camera (or another camera having some other FOV) that is being controlled by a mobile computing device that is in wireless communication with the camera. Such a mobile computing device that controls the 360-degree camera (or another camera having some other FOV) can be, e.g., a smartphone or a tablet type of mobile computing device, but is not limited thereto. An application installed on such a mobile computing device can be used to control the 360-degree camera (or another camera having some other FOV), as well as to generate the third image (C) of the scene that is devoid of the person and includes portions of the scene that were blocked by the person in the first and second images (A and B).

In accordance with certain embodiments, the third image (C) of the scene is generated using computer vision to identify the person within each of the first and second images (A and B), and combining a portion of the first image (A) that is devoid of the person with a portion of the second image (B) that is devoid of the person to produce the third image (C) of the scene that is devoid of the person and includes the portions of the scene that were blocked by the person in the first and second images.

In accordance with certain embodiments, the third image (C) of the scene is generated by: identifying first and second portions (A1, A2) of the first image (A) that differ from the second image (B); identifying first and second portions (B1, B2) of the second image (B) that differ from the first image (A); determining a first metric of similarity (a1) indicative of similarity between the first portion (A1) of the first image (A) that differs from the second image (B) and a remaining portion of the first image (A); determining a second metric of similarity (a2) indicative of similarity between the second portion (A2) of the first image (A) that differs from the second image (B) and a remaining portion of the first image (A); determining a third metric of similarity (b1) indicative of similarity between the first portion (B1) of the second image (B) that differs from the first image (A) and a remaining portion of the second image (B); and determining a fourth metric of similarity (b2) indicative of similarity between the second portion (B2) of the second image (B) that differs from the first image (A) and a remaining portion of the second image (B). Further, the third image (C) of the scene is generated by determining, based on the first, second, third, and fourth metrics of similarity (a1, a2, b1, b2), which one of the first portion (A1) of the first image (A) and the first portion (B1) of the second image (B) is to be included in the third image (C), and which one of the second portion (A2) of the first image (A) and the second portion (B2) of the second image (B) is to be included in the third image (C). More specifically, this may include comparing a sum of the first and fourth metrics (a1+b2) to a sum of the second and third metrics (a2+b1), e.g., to determine whether or not the sum of the first and fourth metrics (a1+b2) is less than the sum of the second and third metrics (a2+b1). Then, based on results of the comparing, there is a determination of which one of the first portion (A1) of the first image (A) and the first portion (B1) of the second image (B) is to be included in the third image (C), and which one of the second portion (A2) of the first image (A) and the second portion (B2) of the second image (B) is to be included in the third image (C).

In accordance with certain embodiments, for each of the first, second, third, and fourth metrics of similarity (a1, a2, b1, b2), a lower magnitude is indicative of higher similarity, and a higher magnitude is indicative of lower similarity. In such embodiments the comparing comprises determining whether the sum of the first and fourth metrics (a1+b2) is less than or greater than the sum of the second and third metrics (a2+b1). In response to determining that the sum of the first and fourth metrics (a1+b2) is less than the sum of the second and third metrics (a2+b1), there is a determination that the first portion (A1) of the first image (A) and the second portion (B2) of the second image (B) are to be included in the third image (C). On the other hand, in response to determining that the sum of the first and fourth metrics (a1+b2) is greater than the sum of the second and third metrics (a2+b1), there is a determination that the second portion (A2) of the first image (A) and the first portion (B1) of the second image (B) are to be included in the third image (C).

In accordance with other embodiments, for each of the first, second, third, and fourth metrics of similarity (a1, a2, b1, b2), a lower magnitude is indicative of lower similarity, and a higher magnitude is indicative of higher similarity. In such embodiments the comparing comprises determining whether the sum of the first and fourth metrics (a1+b2) is less than or greater than the sum of the second and third metrics (a2+b1). In response to determining that the sum of the first and fourth metrics (a1+b2) is greater than the sum of the second and third metrics (a2+b1), there is a determination that the first portion (A1) of the first image (A) and the second portion (B2) of the second image (B) are to be included in the third image (C). On the other hand, in response to determining that the sum of the first and fourth metrics (a1+b2) is less than the sum of the second and third metrics (a2+b1), there is a determination that the second portion (A2) of the first image (A) and the first portion (B1) of the second image (B) are to be included in the third image (C).

Certain embodiments of the present technology are also directed to one or more processor readable storage devices having instructions encoded thereon which, when executed, cause one or more processors to perform the methods summarized above.

Certain embodiments of the present technology are related to a method for use with a first mobile device comprising a first camera and a second mobile device, wherein the method is for determining spatial information for one or more items of interest within a graphical representation of a region generated based on one or more images of the region captured using the first camera of the first mobile device. Such a method comprises capturing one or more images of the region using the first camera of the first mobile device and generating or otherwise obtaining the graphical representation of the region based on the one or more images of the region captured using the first camera of the first mobile device. The method also includes, for each item of interest of the one or more items of interest, using the first camera of the first mobile device to capture one or more further images of the region while the second mobile device is placed in close proximity to the item of interest, and thus, the second mobile device appears in the one or more further images. The method further includes, for each item of interest of the one or more items of interest, determining spatial information for the item of interest based on the one or more further images of the region within which the second mobile device appears.

In accordance with certain embodiments, the second mobile device includes a front side on which are located a display and a front side camera, and a back side on which is located a back side camera, and the method includes: displaying an indicator on the display of the second mobile device, such that the indicator will be shown in the one or more images of the region captured using the first camera of the first mobile device.

In accordance with certain embodiments, determining spatial information for an item of interest based on the one or more further images of the region within which the second mobile device appears comprises intersecting a ray from a center of the first camera of the first mobile device to the second mobile device that appears near the item of interest within the graphical representation of the region.

In accordance with certain embodiments, an item of interest within the graphical representation of the region is identified based on the indicator on the display of the second mobile device included in the one or more images of the region captured using the first camera of the first mobile device.

In accordance with certain embodiments the method further comprises, for each item of interest of the one or more items of interest: capturing a further image that includes the first mobile device, using the front side camera of the second mobile device; and using the further image, captured using the front side camera of the second mobile device, to increase at least one of reliability or accuracy of the spatial information determined for the item of interest.

In accordance with certain embodiments, the first camera of the first mobile device comprises a 360-degree camera. In accordance with certain embodiments, the second mobile device comprises one of a smartphone or a tablet type of mobile computing device.

In accordance with certain embodiments, determining spatial information for an item of interest based on the one or more further images of the region within which the second mobile device appears comprises identifying an arm or other body part of a person holding the second mobile device, and intersecting a ray from a center of the first camera of the first mobile device to the identified arm or other body part of the person located near the item of interest within a graphical representation of the region.

Certain embodiments of the present technology are also directed to one or more processor readable storage devices having instructions encoded thereon which, when executed, cause one or more processors to perform the methods summarized above.

A system according to certain embodiments of the present technology comprises: a first mobile device comprising a first camera that is used to capture one or more images of a region, which one or more images are used to generate a graphical representation of the region; and a second mobile device comprising one or more processors. In certain such embodiments, the first mobile device is configured to capture using the first camera thereof, for each item of interest of one or more items of interest, one or more further images of the region while the second mobile device is placed in close proximity to the item of interest, and thus, the second mobile device appears in the one or more further images. At least one of the one or more processors of the second mobile device is configured to determine, for each item of interest of the one or more items of interest, spatial information for the item of interest based on the one or more further images of the region within which the second mobile device appears. The graphical representation of the region can be generated using one or more processors of the first mobile device, of the second mobile device and/or of a server that receives the one or more images of the region captured by the first camera of the first mobile device.

In accordance with certain embodiments, at least one of the one or more processors of the second mobile device is configured to determine, for each item of interest of the one or more items of interest, spatial information for the item of interest by intersecting a ray from a center of the first camera of the first mobile device to the second mobile device that appears near the item of interest within the graphical representation of the region.

In accordance with certain embodiments, the second mobile device includes a front side on which are located a display and a front side camera, and a back side on which is located a back side camera; the second mobile device is configured to display an indicator on the display thereof, such that the indicator will be shown in the one or more images of the region captured using the first camera of the first mobile device; and an item of interest within the graphical representation of the region is identified based on the indicator on the display of the second mobile device included in the one or more images of the region captured using the first camera of the first mobile device.

In accordance with certain embodiments, the second mobile device is configured, for each item of interest of the one or more items of interest, to capture a further image that includes the first mobile device, using the front side camera of the second mobile device; and the second mobile device is configured to use the further image, captured using the front side camera thereof, to increase at least one of reliability or accuracy of the spatial information determined for at least one item of interest.

In accordance with certain embodiments, at least one of the one or more processors of the second mobile device is configured to determine the spatial information for an item of interest by identifying an arm or other body part of a person holding the second mobile device, and intersecting a ray from a center of the first camera of the first mobile device to the identified arm or other body part of the person located near the item of interest within a graphical representation of the region.

This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates exemplary components of a mobile computing device with which embodiments of the present technology can be used.

FIGS. 2A and 2B are, respectively, front and back views of an exemplary smartphone type of mobile computing device with which embodiments of the present technology can be used.

FIG. 3 illustrates exemplary components of a 360-degree camera with which embodiments of the present technology can be used.

FIG. 4 is used to illustrate that a smartphone type of mobile computing device and a camera can communicate with one another, and that the smartphone type of mobile computing device can use a communication network to upload data to, and download data from, a remote system that includes one or more servers.

FIG. 5A shows an image of a room while a person is located at a first location; FIG. 5B shows an image of the room while the person is located at a second location; and FIG. 5C shows an image of the room, generated based on the images of FIGS. 5A and 5B, that is devoid of the person shown in the images of FIGS. 5A and 5B.

FIG. 6 is a high level flow diagram that is used to summarize methods according to certain embodiments of the present technology that can be used to generate an image of a region (e.g., room) that is devoid of a person based on two images of the region that include the person located at two different locations.

FIG. 7 is a high level diagram that is used to describe additional details of one of the steps introduced in FIG. 6, according to certain embodiments of the present technology.

FIG. 8A shows a first image of a room after a person in the image has been identified and removed from the first image; FIG. 8B shows a second image of the room after the person in the image has been identified and removed from the second image; and FIG. 8C shows a third image of the room generated based on the first and second images, wherein the person is not present in the third image of the room.

FIG. 9 is a high level diagram that is used to describe additional details of one of the steps introduced in FIG. 6, according to certain embodiments of the present technology.

FIGS. 10A and 10B show first and second images of a same region (e.g., room) with differences between the first and second images identified; and FIG. 10C shows a third image that is devoid of the person (shown in FIGS. 10A and 10B) and that can be generated using embodiments of the present technology described with reference to FIG. 9.

FIG. 11 shows an example of an image of a room captured by a 360-degree camera, where a smartphone is placed in close proximity to an item of interest, and thus, the smartphone appears in the captured image.

FIG. 12 shows that an application can display an arrow, or other identifiable indicator, on a display of a smartphone so that the identifiable indicator (e.g., the arrow) will be shown in an image captured by a 360-degree camera, so that the indicator can be used to generate coordinates and/or other metadata for an item of interest in close proximity to the identifiable indicator (e.g., the arrow).

FIG. 13 is a high level flow diagram that is used to summarize methods according to certain embodiments of the present technology that can be used to determine spatial information for one or more items of interest within a graphical representation of a region generated based on one or more images of the region captured using a camera of a mobile device.

FIG. 14 shows an example of a floor plan that can be generated from one or more images captured by a 360-degree camera, wherein the floor plan is annotated to show certain items of interest in accordance with certain embodiments of the present technology.

FIG. 15 shows an example of a three-dimensional graphical representation of a region that includes a marker pin specifying where a repair is needed.

DETAILED DESCRIPTION

FIG. 1 illustrates an exemplary mobile computing device 102 with which embodiments of the present technology described herein can be used. The mobile computing device 102 can be a smartphone, such as, but not limited to, an iPhone™, a Blackberry™, an Android™-based or a Windows™-based smartphone. The mobile computing device 102 can alternatively be a tablet computing device, such as, but not limited to, an iPad™, an Android™-based or a Windows™-based tablet. For another example, the mobile computing device 102 can be an iPod Touch™, or the like.

Referring to the block diagram of FIG. 1, the mobile computing device 102 is shown as including a front-facing camera 104 a, a rear-facing camera 104 b, an accelerometer 106, a magnetometer 108, a gyroscope 110, a microphone 112, a display 114 (which may or may not be a touch screen display), a processor 116, memory 118, a transceiver 120, a speaker 122 and a drive unit 124. Each of these elements is shown as being connected to a bus 128, which enables the various components to communicate with one another and transfer data from one element to another. It is also possible that some of the elements can communicate with one another without using the bus 128.

The front of the mobile computing device 102 is the major side of the device on which the display 114 (which may or may not be a touch screen display) is located, and the back of the mobile computing device 102 is the other or opposite major side. The rear-facing camera 104 b is located on the back of the mobile computing device 102. The front-facing camera 104 a is located on the front of the mobile computing device 102. The front-facing camera 104 a can be used to obtain images or video, typically of the person holding the mobile computing device 102. The rear-facing camera 104 b can be used to obtain images or video, typically of a scene and/or of a person other than the person holding the mobile computing device 102.

The accelerometer 106 can be used to measure linear acceleration relative to a frame of reference, and thus, can be used to detect motion of the mobile computing device 102 as well as to detect an angle of the mobile device 102 relative to the horizon or ground. The magnetometer 108 can be used as a compass to determine a direction of magnetic north and bearings relative to magnetic north. The gyroscope 110 can be used to detect both vertical and horizontal orientation of the mobile computing device 102, and together with the accelerometer 106 and magnetometer 108 can be used to obtain very accurate information about the orientation of the mobile computing device 102. The microphone 112 can be used to detect voice commands for controlling the mobile computing device 102, as well as for enabling the mobile computing device 102 to operate as a mobile phone, e.g., if the mobile computing device 102 is a smartphone. It is also possible that the mobile computing device 102 includes additional sensor elements, such as, but not limited to, an ambient light sensor and/or a proximity sensor.

The display 114, which may or may not be a touch screen type of display, can be used as a user interface to visually display items (e.g., images, options, instructions, etc.) to a user and accept inputs from a user. Further, the mobile computing device 102 can include additional elements, such as keys, buttons, a track-pad, a trackball, or the like, that accept inputs from a user.

The memory 118 can be used to store software and/or firmware that controls the mobile computing device 102, as well as to store images captured using the camera 104, but is not limited thereto. Various different types of memory, including non-volatile and volatile memory, can be included in the mobile computing device 102. The drive unit 124, e.g., a hard drive, but not limited thereto, can also be used to store software that controls the mobile computing device 102, as well as to store images captured using the camera 104, but is not limited thereto. The memory 118 and the drive unit 124 can include a machine readable medium on which is stored one or more sets of executable instructions (e.g., apps) embodying one or more of the methodologies and/or functions described herein. In place of the drive unit 124, or in addition to the drive unit, the mobile computing device can include a solid-state storage device, such as those comprising flash memory or any form of non-volatile memory. The term “machine-readable medium” as used herein should be taken to include all forms of storage media, either as a single medium or multiple media, in all forms; e.g., a centralized or distributed database and/or associated caches and servers; one or more storage devices, such as storage drives (including e.g., magnetic and optical drives and storage mechanisms); and one or more instances of memory devices or modules (whether main memory, cache storage either internal or external to a processor, or buffers). The term “machine-readable medium” or “computer-readable medium” shall be taken to include any tangible non-transitory medium which is capable of storing or encoding a sequence of instructions for execution by the machine and that causes the machine to perform any one of the methodologies. The term “non-transitory medium” expressly includes all forms of storage drives (optical, magnetic, etc.) and all forms of memory devices (e.g., DRAM, Flash (of all storage designs), SRAM, MRAM, phase change, etc.), as well as all other structures designed to store information of any type for later retrieval.

The transceiver 120, which is connected to an antenna 126, can be used to transmit and receive data wirelessly using, e.g., Wi-Fi, cellular communications or mobile satellite communications. The mobile computing device 102 may also be able to perform wireless communications using Bluetooth and/or other wireless technologies. It is also possible the mobile computing device 102 includes multiple types of transceivers and/or multiple types of antennas.

The speaker 122 can be used to provide auditory instructions, feedback and/or indicators to a user, to play back recordings (e.g., musical recordings), as well as to enable the mobile computing device 102 to operate as a mobile phone.

The processor 116 can be used to control the various other elements of the mobile computing device 102, e.g., under control of software and/or firmware stored in the memory 118 and/or drive unit 124. It is also possible that there are multiple processors 116, e.g., a central processing unit (CPU) and a graphics processing unit (GPU).

FIGS. 2A and 2B are, respectively, front and back views of an exemplary smartphone 202 type of mobile computing device 102 with which embodiments of the present technology can be used. Referring to FIG. 2A, a front 204 of the smartphone 202 is shown as including a touchscreen display 114, a button 210, a speaker 122 and a front-facing camera 104 a. Referring to FIG. 2B, a back 206 of the smartphone 202 is shown as including a rear-facing camera 104 b and a camera flash 214. The smartphone 202, and more generally the mobile computing device 102, can include additional buttons on the front, back and/or sides, e.g., for powering the device on and off, volume control and/or the like. In FIG. 2A the front-facing camera 104 a is shown as being centered relative to the sides of the smartphone 202. However, depending upon the smartphone 202, and more generally the mobile computing device 102, that is not always the case. In FIG. 2B the rear-facing camera 104 b is shown as being closer to one side of the smartphone than the other, and thus, being offset relative to the sides of the smartphone 202. However, depending upon the smartphone 202, and more generally the mobile computing device 102, that is not always the case. The smartphone 202 in FIGS. 2A and 2B is arranged in what is referred to as a “portrait” position, where the height is greater than the width. If the smartphone 202 were turned sideways by 90 degrees, then the smartphone 202 would be arranged in what is referred to as a “landscape” position, where the width is greater than the height.

An exemplary block diagram of a 360-degree camera 302 is illustrated in FIG. 3. A description is given hereinafter of an example in which the 360-degree camera 302 is a full spherical (omnidirectional) camera that uses two imaging elements. However, the 360-degree camera 302 may include three or more imaging elements. In addition, the 360-degree camera 302 is not necessarily omnidirectional. The 360-degree camera 302 can be, e.g., the Ricoh™ Theta V™ 4K 360-degree Spherical Camera, or any one of numerous other 360-degree cameras available from companies such as, but not limited to, Ricoh™, Samsung™, LG Electronics™, Garmin™, Kodak™, Insta360™, Sony™, etc.

As illustrated in FIG. 3, the 360-degree camera 302 includes an imaging unit 304, an image processing unit 306, an imaging control unit 308, a microphone 310, an audio processing unit 312, a processor 316, memory 318, a transceiver 320, an antenna 326, an operation unit 322, a network interface 324, and an electronic compass 328. The 360-degree camera 302 is an example of a type of digital camera, because it captures and stores images in digital memory.

The processor 316 can be used to control the various other elements of the 360-degree camera 302, e.g., under control of software and/or firmware stored in the memory 318. It is also possible that there are multiple processors 316, e.g., a central processing unit (CPU) and a graphics processing unit (GPU).

The electronic compass 328 can include, e.g., an accelerometer, a magnetometer, and/or a gyroscope, examples of which were discussed above with reference to FIG. 1, but are not limited thereto. The electronic compass 328 can determine an orientation and a tilt (roll angle) of the 360-degree camera 302 from the Earth's magnetism to output orientation and tilt information. This orientation and tilt information can be related information (metadata) described in compliance with the Exchangeable image file format (Exif). Further, the orientation and tilt information can be used for image processing such as image correction of captured images. Further, the related information can also include a date and time when an image is captured by the 360-degree camera 302, and a size of the image data.

The imaging unit 304 includes two wide-angle lenses (so-called fish-eye lenses) 305 a and 305 b, each having an angle of view equal to or greater than 180 degrees so as to form a hemispherical image. The imaging unit 304 further includes the two imaging elements 303 a and 303 b corresponding to the wide-angle lenses 305 a and 305 b, respectively.

The imaging elements 303 a and 303 b include image sensors such as CMOS sensors or CCD sensors, which convert optical images formed by the fisheye lenses 305 a and 305 b, respectively, into electric signals to output image data. Further, the imaging elements 303 a and 303 b can each include a timing generation circuit, which generates horizontal or vertical synchronization signals, pixel clocks and the like for the image sensor. Furthermore, the imaging elements 303 a and 303 b can each include a group of registers, in which various commands, parameters and the like for operations of an imaging element are set.

Each of the imaging elements 303 a and 303 b of the imaging unit 304 is connected to the image processing unit 306 via a parallel interface bus. In addition, each of the imaging elements 303 a and 303 b of the imaging unit 304 is connected to the imaging control unit 308 via a serial interface bus such as an I2C bus. The image processing unit 306 and the imaging control unit 308 are each connected to the processor 316 via a bus 319. Furthermore, the memory 318, the transceiver 320, the operation unit 322, the network interface 324, and the electronic compass 328 are also connected to the bus 319.

The image processing unit 306 acquires image data from each of the imaging elements 303 a and 303 b via the parallel interface bus. The image processing unit 306 further performs predetermined processing on each of the acquired image data, and combines these image data. For example, data of a “Mercator image” as illustrated, e.g., in FIGS. 5A, 5B and 5C, can be generated.

The imaging control unit 308 functions as a master device while the imaging elements 303 a and 303 b each function as a slave device. The imaging control unit 308 sets commands and the like in the group of registers of the imaging elements 303 a and 303 b via a bus. The imaging control unit 308 receives commands from the processor 316. Further, the imaging control unit 308 acquires status data to be set in the group of registers of the imaging elements 303 a and 303 b using a bus. The imaging control unit 308 sends the acquired status data to the processor 316.

The imaging control unit 308 can instruct the imaging elements 303 a and 303 b to output the image data in response to a shutter button of the operation unit 322 being pressed, or in response to control signals received from another device, such as a smartphone type of mobile computing device (e.g., 102, or 202), but is not limited thereto.

The 360-degree camera 302 may display a preview image on a display. Furthermore, the imaging control unit 308 operates in cooperation with the processor 316 to synchronize times when the imaging elements 303 a and 303 b output the image data. The 360-degree camera 302 may include a display unit, such as a display.

The microphone 310 converts sound to audio data (signal). The audio processing unit 312 acquires the audio data from the microphone 310 via an interface bus and performs predetermined processing on the audio data.

The processor 316 controls an entire operation of the 360-degree camera 302. Further, the processor 316 executes processes performed by the 360-degree camera 302. The memory 318 can include, e.g., read only memory (ROM), static random access memory (SRAM), and/or dynamic random access memory (DRAM). ROM can store various programs to enable the processor 316 to execute processes. SRAM and DRAM can operate as work memory to store programs loaded from ROM for execution by the processor 316 or data in current processing. More specifically, DRAM can store image data currently processed by the image processing unit 306 and data of a Mercator image on which processing has been performed.

The operation unit 322 can include various operation keys, a power switch, the shutter button, and a touch panel having functions of both displaying information and receiving input from a user, which may be used in combination. A user can operate the operation keys, etc., to input various photographing modes or photographing conditions to the 360-degree camera.

The network interface 324 collectively refers to an interface circuit such as a USB interface that allows the 360-degree camera 302 to communicate data with external media such as an SD card or an external device. The network interface 324 connects the 360-degree camera to an external device, etc., through either wired or wireless communication. For example, data of a Mercator image, which is stored in DRAM, can be stored in external media via the network interface 324 or transmitted to an external apparatus such as a smartphone via the network interface 324.

The transceiver 320 can communicate with an external device via the antenna 326 of the 360-degree camera by Wi-Fi, or by near distance wireless communication such as Near Field Communication (NFC) or Bluetooth, but is not limited thereto. Such communications can be used by the 360-degree camera 302 to transmit the data (e.g., of a Mercator image) to an external device using the transceiver 320. Such an external device can be, e.g., a smartphone type mobile computing device (e.g., 102, 202), but is not limited thereto.

FIG. 4 is used to illustrate that the mobile computing device 102 (such as the smartphone 202), and a 360-degree camera 302, can use a communication network 402 to upload data to, and download data from, a remote system 412 that includes one or more servers 422. Preferably, the mobile computing device 102 and the 360-degree camera 302 can achieve such uploading and downloading wirelessly. Various communication protocols may be used to facilitate communication between the various components shown in FIG. 4. These communication protocols may include, for example, TCP/IP, HTTP protocols, Wi-Fi protocols, wireless application protocol (WAP), vendor-specific protocols, customized protocols, but are not limited thereto. While in one embodiment communication network 402 is the Internet, in other embodiments communication network 402 may be any suitable communication network including a local area network (LAN), a wide area network (WAN), a wireless network, an intranet, a private network, a public network, a switched network, and combinations of these, and the like. It is also possible that the mobile computing device 102 (such as the smartphone 202) and the 360-degree camera 302 communicate directly with one another, without requiring a communication network.

The distributed computer network shown in FIG. 4 is merely illustrative of a computing environment in which embodiments of the present technology can be implemented, but is not intended to limit the scope of the embodiments described herein. One of ordinary skill in the art would recognize other variations, modifications, and alternatives. For example, the various servers 422 may be distributed. In other words, the remote system 412 can be a distributed system. Further, the servers can include or have access to databases and/or other types of data storage components, each of which can be considered part of the remote system 412. In accordance with certain embodiments, the mobile computing device 102 can upload data to the remote system 412 so that the remote system can generate 3D models based on the uploaded data, and the remote system 412 can download data to the mobile computing device 102 so that the mobile computing device 102 can display 3D models to a user of the mobile computing device.

FIG. 5A shows an exemplary image 502A captured using a 360-degree camera, such as the 360-degree camera 302 described above with reference to FIG. 3. Such an image can be used, e.g., to generate a schematic, blueprint or other graphical representation of the room captured in the image 502A. However, because a portion 503A of the image 502A includes the person that is controlling the 360-degree camera (or some other person), a portion of the room is blocked by the person. More generally, a portion 504A of the scene shown in the image 502A is blocked by the person. Certain embodiments of the present invention, initially described with reference to the high level flow diagram of FIG. 6, can be used to essentially remove a person from an image, or more specifically, to generate an image that is devoid of a person. While such embodiments are especially useful with a 360-degree camera, such embodiments are also useful with cameras having other FOVs, such as a FOV that is 120 degrees or 180 degrees, but not limited thereto.

Referring to FIG. 6, step 602 involves obtaining a first image of a scene within the FOV of a camera while a person is at a first location within the FOV of the camera, and thus, the person appears in a first portion of the first image. The first image is also referred to herein as image A, or as the first image (A). The image 502A in FIG. 5A illustrates an example of the first image (A). Step 604 involves obtaining a second image of the scene within the FOV of the camera while the person is at a second location (within the FOV of the camera) that differs from the first location, and thus, the person appears in a second portion of the second image that differs from the first portion of the first image. The second image is also referred to herein as image B, or as the second image (B). The image 502B in FIG. 5B illustrates an example of the second image (B). Referring again to FIG. 6, a visual and/or audio instruction can be provided to the person to tell them to move before the second image is captured, wherein the capturing of the second image (B) can be performed under manual control by the person, or automatically by an application running on a mobile computing device, or the like. The person should move far enough such that when the second image (B) of the scene is taken, the person is no longer blocking the portion of the scene that they were blocking in the first image (A). Accordingly, the instruction may state that the person should move at least a specified distance (e.g., 5 feet, or the like). The capturing of the second image (B) by the camera (e.g., a 360-degree camera) can be triggered in response to the person pressing a certain button on their smartphone type mobile computing device, after the person has moved at least the specified distance relative to where they were when the first image (A) was captured. Alternatively, an application which implements this technology, which is running on the smartphone type mobile computing device, can automatically detect when the person has moved a sufficient distance and can trigger the capturing of the second image. In still another alternative, triggering capture of the second image can be based on a timer, e.g., such that the second image (B) is captured a specified time (e.g., 30 seconds, 1 minute, 90 seconds, or 2 minutes) after the first image (A) was captured. Other variations are also possible and within the scope of the embodiments described herein. Further, it is noted that other types of mobile computing devices, such as a tablet computing device, can be used in place of a smartphone type mobile computing device.
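By way of a non-limiting illustration, the following is a minimal sketch of one way such an application might automatically detect that the person has moved far enough to trigger the second capture. It assumes OpenCV's stock HOG person detector, hypothetical preview frames read from the camera, and an illustrative movement threshold expressed as a fraction of the image width; none of these choices are required by the embodiments described above.

```python
# Illustrative sketch only: trigger the second capture once the detected person
# has moved far enough from where they stood when the first image was captured.
# The detector, threshold, and frame source are assumptions, not requirements.
import cv2
import numpy as np

hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

def person_centroid(image):
    """Return the centroid (x, y) of the most confident person detection, or None."""
    rects, weights = hog.detectMultiScale(image, winStride=(8, 8))
    if len(rects) == 0:
        return None
    x, y, w, h = rects[int(np.argmax(weights))]
    return np.array([x + w / 2.0, y + h / 2.0])

def moved_far_enough(first_image, preview_frame, min_fraction=0.15):
    """True once the person's centroid has shifted by at least min_fraction of
    the image width between the first capture and the current preview frame."""
    c0 = person_centroid(first_image)
    c1 = person_centroid(preview_frame)
    if c0 is None or c1 is None:
        return False
    return np.linalg.norm(c1 - c0) >= min_fraction * first_image.shape[1]
```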

Still referring to FIG. 6, step 606 involves generating, based on the images A and B, a third image of the scene, such that the third image of the scene is devoid of the person and includes portions of the scene that were blocked by the person in the images A and B. The third image is also referred to herein as image C, or as the third image (C). In accordance with an embodiment, step 606 is performed using one or more processors. The image 502C in FIG. 5C illustrates an example of the third image (C).

An exemplary implementation of step 606 is described with reference to the flow diagram of FIG. 7, which includes steps 702 and 704. In other words, the third image (C) that is generated at step 606 can be generated by performing sub-steps 702 and 704. Referring to FIG. 7, step 702 involves using computer vision to identify the person within each of the first and second images (A and B). There exist well known computer vision techniques for recognizing humans within images, and thus, such techniques need not be described herein. Computer vision techniques that detect differences between multiple images can additionally or alternatively be used. Still referring to FIG. 7, step 704 involves combining a portion of the first image (A) that is devoid of the person with a portion of the second image (B) that is devoid of the person to produce the third image (C) of the scene that is devoid of the person and includes the portions of the scene that were blocked by the person in the first and second images (A and B).

FIG. 8A shows an example of the image A, after the person in the image has been identified within the image and the area around the person has been expanded, i.e., dilated. The resulting image is labeled 802A. The blob 804A shown in the image 802A is where the person was located, and thus, a portion of the scene where the blob is located is missing. FIG. 8B shows an example of the image B, after the person in the image has been identified within the image and the area around the person has been expanded, i.e., dilated. The resulting image is labeled 802B. The blob 804B shown in the image 802B is where the person was located, and thus, a portion of the scene where the blob is located is missing. The image 802C shown in FIG. 8C, which is the same as the image 502C shown in FIG. 5C, is an example of the combined image devoid of the person.
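A minimal sketch of the approach of FIGS. 7 and 8A-8C is shown below. It assumes the camera did not move between the two captures, so the images are pixel-aligned, and it assumes a person-segmentation helper (segment_person) supplied by any suitable person detection or segmentation technique; the helper name and the dilation size are illustrative only.

```python
# Illustrative sketch of steps 702 and 704: identify the person in each image,
# dilate the resulting masks (the "blobs" of FIGS. 8A and 8B), and fill the
# region the person blocked in image A with the corresponding pixels of image B.
import cv2
import numpy as np

def remove_person(image_a, image_b, segment_person, dilate_px=15):
    """Composite a person-free image C from aligned images A and B.

    segment_person(image) is assumed to return a uint8 mask that is 255 where
    the person appears and 0 elsewhere; it is a placeholder for any person
    detection or segmentation technique."""
    kernel = np.ones((dilate_px, dilate_px), np.uint8)
    mask_a = cv2.dilate(segment_person(image_a), kernel)  # person blob in A, expanded
    mask_b = cv2.dilate(segment_person(image_b), kernel)  # person blob in B, expanded
    image_c = image_a.copy()
    # Pixels blocked by the person in A but visible (person-free) in B.
    fill_from_b = (mask_a > 0) & (mask_b == 0)
    image_c[fill_from_b] = image_b[fill_from_b]
    return image_c
```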

Another exemplary implementation of step 606 is described with reference to the flow diagram of FIG. 9, which includes steps 902 through 916. In other words, the third image (also known as image C) that is generated at step 606 can be generated by performing sub-steps 902 through 916. The embodiment described with reference to FIG. 9 takes advantage of the assumption that when a person is located in an image, the portion of the image occupied by the person will be quite different than a remaining portion of the image (e.g., a portion of the image surrounding the person). Referring to FIG. 9, step 902 involves identifying first and second portions (A1, A2) of the first image (A) that differ from the second image (B), and step 904 involves identifying first and second portions (B1, B2) of the second image (B) that differ from the first image (A). The order of steps 902 and 904 can be reversed, or these steps can be performed at the same time. Referring to FIGS. 10A and 10B, the image 1002A is an example of the first image (A), and the image 1002B is an example of the second image (B). The labels A1 and A2 in FIG. 10A are examples of the first and second portions (A1, A2) of the first image (A) that differ from the second image (B). The labels B1 and B2 in FIG. 10B are examples of the first and second portions (B1, B2) of the second image (B) that differ from the first image (A).
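One possible way to identify the differing portions of steps 902 and 904 is sketched below, again assuming the two captures are pixel-aligned; the threshold and minimum-area values are illustrative. Because the differences between A and B are symmetric, the two bounding boxes returned describe both the portions (A1, A2) of image A and the portions (B1, B2) of image B.

```python
# Illustrative sketch of steps 902 and 904: threshold the absolute difference
# between the two aligned images and keep the two largest connected components,
# which correspond to the person's two locations.
import cv2
import numpy as np

def differing_portions(image_a, image_b, thresh=30, min_area=500):
    """Return up to two (x, y, w, h) boxes where images A and B differ most."""
    gray_a = cv2.cvtColor(image_a, cv2.COLOR_BGR2GRAY)
    gray_b = cv2.cvtColor(image_b, cv2.COLOR_BGR2GRAY)
    _, mask = cv2.threshold(cv2.absdiff(gray_a, gray_b), thresh, 255, cv2.THRESH_BINARY)
    # Close small gaps so each person silhouette becomes one connected region.
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, np.ones((9, 9), np.uint8))
    num, _, stats, _ = cv2.connectedComponentsWithStats(mask)
    comps = [(stats[i, cv2.CC_STAT_AREA], tuple(stats[i, :4]))
             for i in range(1, num) if stats[i, cv2.CC_STAT_AREA] >= min_area]
    comps.sort(key=lambda c: c[0], reverse=True)
    return [box for _, box in comps[:2]]
```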

Step 906 involves determining a first metric of similarity (a1) indicative of similarity between the first portion (A1) of the first image (A) that differs from the second image (B) and a remaining portion of the first image (A). The remaining portion of the first image (A) referred to in step 906 can be, e.g., a portion of the first image (A) surrounding the first portion (A1), or an entirety of the first image (A) besides the first portion (A1), but is not limited thereto. Step 908 involves determining a second metric of similarity (a2) indicative of similarity between the second portion (A2) of the first image (A) that differs from the second image (B) and a remaining portion of the first image (A). The remaining portion of the first image (A) referred to in step 908 can be, e.g., a portion of the first image (A) surrounding the second portion (A2), or an entirety of the first image (A) besides the second portion (A2), but is not limited thereto. The order of steps 906 and 908 can be reversed, or these steps can be performed at the same time.

Step 910 involves determining a third metric of similarity (b1) indicative of similarity between the first portion (B1) of the second image (B) that differs from the first image (A) and a remaining portion of the second image (B). The remaining portion of the second image (B) referred to in step 910 can be, e.g., a portion of the second image (B) surrounding the first portion (B1), or an entirety of the second image (B) besides the first portion (B1), but is not limited thereto. Step 912 involves determining a fourth metric of similarity (b2) indicative of similarity between the second portion (B2) of the second image (B) that differs from the first image (A) and a remaining portion of the second image (B). The remaining portion of the second image (B) referred to in step 912 can be, e.g., a portion of the second image (B) surrounding the second portion (B2), or an entirety of the second image (B) besides the second portion (B2), but is not limited thereto. The order of steps 910 and 912 can be reversed, or these steps can be performed at the same time. It would also be possible for steps 910 and 912 to be performed before steps 906 and 908, or for all of these steps to be performed at the same time. Other variations are also possible.
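The metrics of similarity could be computed in many ways; the sketch below illustrates just one option, in which the color histogram of a differing portion is compared against the histogram of a ring of surrounding pixels using the Bhattacharyya distance, so that a lower magnitude indicates higher similarity (matching the convention discussed further below). The margin size and histogram parameters are illustrative assumptions.

```python
# Illustrative sketch of one similarity metric for steps 906-912: compare the
# color histogram of a differing portion with that of the pixels surrounding it.
import cv2
import numpy as np

def _histogram(image, mask):
    hist = cv2.calcHist([image], [0, 1, 2], mask, [8, 8, 8], [0, 256, 0, 256, 0, 256])
    return cv2.normalize(hist, hist).flatten()

def similarity_metric(image, box, margin=40):
    """Lower value = portion looks more like its surroundings (Bhattacharyya distance)."""
    x, y, w, h = box
    img_h, img_w = image.shape[:2]
    inner = np.zeros((img_h, img_w), np.uint8)
    inner[y:y + h, x:x + w] = 255
    ring = np.zeros((img_h, img_w), np.uint8)
    ring[max(0, y - margin):min(img_h, y + h + margin),
         max(0, x - margin):min(img_w, x + w + margin)] = 255
    ring[inner > 0] = 0  # keep only the pixels surrounding the portion
    return cv2.compareHist(_histogram(image, inner), _histogram(image, ring),
                           cv2.HISTCMP_BHATTACHARYYA)
```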

Still referring to FIG. 9, step 914 involves determining, based on the first, second, third, and fourth metrics of similarity (a1, a2, b1, b2), which one of the first portion (A1) of the first image (A) and the first portion (B1) of the second image (B) is to be included in the third image (C), and which one of the second portion (A2) of the first image (A) and the second portion (B2) of the second image (B) is to be included in the third image (C). Finally, at step 916, the third image (C) is generated based on the results of step 914.

Step 914 can include comparing a sum of the first and fourth metrics (a1+b2) to a sum of the second and third metrics (a2+b1), and determining, based on results of the comparing, which one of the first portion (A1) of the first image (A) and the first portion (B1) of the second image (B) is to be included in the third image (C), and which one of the second portion (A2) of the first image (A) and the second portion (B2) of the second image (B) is to be included in the third image (C). In accordance with certain embodiments, for each of the first, second, third, and fourth metrics of similarity (a1, a2, b1, b2), a lower magnitude is indicative of higher similarity, and a higher magnitude is indicative of lower similarity. Step 914 can involve determining whether the sum of the first and fourth metrics (a1+b2) is less than or greater than the sum of the second and third metrics (a2+b1). In embodiments where a lower magnitude is indicative of higher similarity (and a higher magnitude is indicative of lower similarity), in response to determining that the sum of the first and fourth metrics (a1+b2) is less than the sum of the second and third metrics (a2+b1), it can be determined that the first portion (A1) of the first image (A) and the second portion (B2) of the second image (B) are to be included in the third image (C). In response to determining that the sum of the first and fourth metrics (a1+b2) is greater than the sum of the second and third metrics (a2+b1), it can be determined that the second portion (A2) of the first image (A) and the first portion (B1) of the second image (B) are to be included in the third image (C).
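Under the lower-magnitude-is-more-similar convention of the preceding paragraph, steps 914 and 916 could be sketched as follows, reusing the hypothetical helpers from the earlier sketches. The portion pairing with the lower combined metric is the one that blends in with its surroundings (i.e., shows background rather than the person), so those portions are kept in the third image (C).

```python
# Illustrative sketch of steps 914 and 916 (lower metric = more similar).
def build_third_image(image_a, image_b, box1, box2, similarity_metric):
    """box1 covers portions A1/B1; box2 covers portions A2/B2 (same pixel regions)."""
    a1 = similarity_metric(image_a, box1)
    a2 = similarity_metric(image_a, box2)
    b1 = similarity_metric(image_b, box1)
    b2 = similarity_metric(image_b, box2)
    image_c = image_a.copy()  # image C starts as a copy of image A
    x1, y1, w1, h1 = box1
    x2, y2, w2, h2 = box2
    if a1 + b2 < a2 + b1:
        # Keep A1 from image A (already present) and copy B2 from image B.
        image_c[y2:y2 + h2, x2:x2 + w2] = image_b[y2:y2 + h2, x2:x2 + w2]
    else:
        # Otherwise keep A2 from image A (already present) and copy B1 from image B.
        image_c[y1:y1 + h1, x1:x1 + w1] = image_b[y1:y1 + h1, x1:x1 + w1]
    return image_c
```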

In accordance with other embodiments, for each of the first, second, third, and fourth metrics of similarity (a1, a2, b1, b2), a higher magnitude is indicative of higher similarity, and a lower magnitude is indicative of lower similarity. Step 914 can involve determining whether the sum of the first and fourth metrics (a1+b2) is less than or greater than the sum of the second and third metrics (a2+b1). In embodiments where a higher magnitude is indicative of higher similarity (and a lower magnitude is indicative of lower similarity), in response to determining that the sum of the first and fourth metrics (a1+b2) is greater than the sum of the second and third metrics (a2+b1), it can be determined that the first portion (A1) of the first image (A) and the second portion (B2) of the second image (B) are to be included in the third image (C). In response to determining that the sum of the first and fourth metrics (a1+b2) is less than the sum of the second and third metrics (a2+b1), it can be determined that the second portion (A2) of the first image (A) and the first portion (B1) of the second image (B) are to be included in the third image (C).

Further embodiments of the present technology, described below, enable items (e.g., that need to be repaired) to be easily and readily tagged within a schematic, blueprint or other graphical representation of a region (e.g., room) in a manner that provides good specificity, e.g., to clearly indicate which one of numerous cabinet pulls in a kitchen needs to be repaired, or to clearly indicate which one of a number of door knobs in a bedroom needs to be repaired. Such embodiments can be performed using a 360-degree camera (or some other type of camera, such as a 270-degree camera, or 180-degree camera, but not limited thereto) and a smartphone type mobile computing device (or some other type of mobile computing device) that are in communication with one another (e.g., via a wireless communication link, or the like). Such embodiments enable spatial information (e.g., spatial coordinates) to be determined for one or more items of interest within a graphical representation of a region (e.g., room) generated based on one or more images of the region captured using the 360-degree camera (or some other type of camera). Such a graphical representation of the region can be a two-dimensional representation, e.g., a 2D floor plan, or a three-dimensional representation, e.g., a 3D floor plan or 3D representation of a portion of a region (e.g., room). Such embodiments can include capturing one or more images of the region (e.g., room) using the 360-degree camera, and generating or otherwise obtaining a graphical representation of the region (e.g., room) based on the one or more images of the region (e.g., room) captured using the 360-degree camera (or other type of camera). The images captured by the 360-degree camera can be transferred to the mobile computing device (e.g., smartphone) using a Wi-Fi network, Bluetooth communication, or some other wireless or wired communication. An application installed on the mobile computing device can generate a graphical representation (e.g., 3D graphical representation) of the region, or the mobile computing device can use a communication network to transfer the image(s) of the region to a remote system (e.g., 412 in FIG. 4) that generates the graphical representation. For each item of interest, of the one or more items of interest (e.g., a broken door handle, a broken cabinet pull, and a worn out section of carpeting), the 360-degree camera can be used to capture one or more further images of the region (e.g., room) while the mobile computing device (e.g., smartphone) is placed in close proximity to the item of interest, and thus, the mobile computing device (e.g., smartphone) appears in the one or more further images. Then, for each item of interest (of the one or more items of interest), spatial information (e.g., spatial coordinates) is determined for the item of interest based on the one or more further images of the region within which the mobile computing device appears. An application that implements and controls aspects of this embodiment can be installed on the mobile computing device (e.g., smartphone), which can control the 360-degree camera (or some other camera).

FIG. 11 shows an example of an image 1102 of a room where a smartphone 1106 is placed in close proximity to an item of interest (a portion of door trim in this example), and thus, the smartphone 1106 appears in the image 1102. To capture the location of the items of interest, the mobile computing device (e.g., smartphone) is used as a marker that can be detected in one or more images captured by the 360-degree camera. The image 1102 shown in FIG. 11 can be captured by a 360-degree camera (e.g., 302 in FIG. 3), which is not shown in FIG. 11, but is presumed to be placed near the center of the room shown in the image 1102. More generally, the 360-degree camera (or other camera) can be placed in a known or later-to-be-calculated central position in the space (e.g., room) where items are to be located. The application on the mobile computing device (e.g., smartphone) is connected to the 360-degree camera (or other camera) and can read images from the 360-degree camera.

Referring to FIG. 12, the application can display an arrow 1208, shape, pattern (e.g., QR code) or other identifiable indicator, potentially animated, on the display of the mobile computing device (e.g., smartphone). The identifiable indicator (e.g., the arrow 1208) is shown on the mobile device display with high contrast so that it can be used to detect the position of the mobile device in the image from the 360-degree camera. For example, in FIGS. 11 and 12, the person (also referred to as the user) holds the smartphone 1106 pointing the arrow 1208 at the item of interest. The smartphone 1106 obtains the image from the 360-degree camera and detects the smartphone (and more specifically the identifiable indicator, e.g., arrow 1208) in the image. In a portion 1210 (e.g., lower portion) of the display on the smartphone 1106, the image from the 360-degree camera is shown with the detected position marked. This allows the user to make sure the position is pointed to and detected correctly. Instead of (or in addition to) detecting the smartphone or identifiable indicator (e.g., arrow 1208) within the image captured by the 360-degree camera, and using the detected smartphone or identifiable indicator to determine the spatial coordinates for the item of interest, the arm or another body part of the person holding the smartphone can be detected within the image captured by the 360-degree camera and used to determine the spatial coordinates for the item of interest. For example, if the smartphone appears very small in the image captured by the 360-degree camera, and is thus not readily identifiable, the person's arm can instead be identified and it can be assumed that the item of interest is in close proximity to the person's arm.
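By way of illustration only, the following minimal sketch (in Python, using OpenCV) shows one possible way an indicator of the type described above could be located within an image read from the 360-degree camera, under the assumption that the indicator is a QR code; it is a hypothetical example, not the specific detection method of the disclosed embodiments.

```python
# Illustrative sketch (not the patented implementation): locate a QR-code
# style indicator, shown on the smartphone display, within an image obtained
# from the 360-degree camera.
import cv2

def find_indicator_center(pano_bgr):
    """Return the (u, v) pixel centroid of a detected QR code, or None."""
    detector = cv2.QRCodeDetector()
    ok, points = detector.detect(pano_bgr)   # points: corner coordinates
    if not ok or points is None:
        return None
    corners = points.reshape(-1, 2)
    return corners.mean(axis=0)              # centroid of the four corners

# Example usage (assuming "pano.jpg" was read from the 360-degree camera):
# pano = cv2.imread("pano.jpg")
# center = find_indicator_center(pano)
# if center is not None:
#     print("Indicator detected at pixel", center)
```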

The user then presses a button to capture the location. In accordance with certain embodiments, when a predetermined button is pressed on the smartphone (or tablet computing device), the 360-degree camera captures the image that includes the smartphone (or tablet computing device), the image is saved (e.g., in the memory of the smartphone or tablet computing device), and the marked position is saved as a location. Later, in post processing, the location of the item of interest in the image captured by the 360-degree camera can be positioned in 3D space by intersecting a ray from the 360-degree camera center to the item of interest (or more specifically, to the smartphone, identifiable indicator, or body part near the item of interest) within the 3D geometry. The post processing can be performed within the mobile computing device (e.g., smartphone), or by a remote system (e.g., 412 in FIG. 4), or a combination thereof, but is not limited thereto. Example additional details of the smartphone 1106 discussed with reference to FIGS. 11 and 12 can be appreciated from the above discussion of the mobile computing device 102 and the smartphone 202 with reference to FIGS. 1, 2A, and 2B.
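As a hedged illustration of the ray construction mentioned above, the following sketch assumes an ideal equirectangular 360-degree image and a particular axis convention (yaw about the vertical axis, pitch positive upward), neither of which is dictated by the disclosure; it converts the pixel at which the indicator was detected into a unit ray leaving the camera center. A corresponding example of intersecting such a ray with the room geometry appears later, after the discussion of FIG. 13.

```python
# Minimal sketch, assuming an ideal equirectangular projection: map the
# detected pixel (u, v) in a panorama of size width x height to a unit ray
# direction from the camera center.  The axis convention here is an
# assumption, not taken from the source document.
import numpy as np

def pixel_to_ray(u, v, width, height):
    yaw = (u / width - 0.5) * 2.0 * np.pi     # -pi .. +pi around the camera
    pitch = (0.5 - v / height) * np.pi        # -pi/2 (down) .. +pi/2 (up)
    return np.array([
        np.cos(pitch) * np.sin(yaw),          # x
        np.sin(pitch),                        # y (up)
        np.cos(pitch) * np.cos(yaw),          # z (forward)
    ])
```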

In accordance with certain embodiments, at the same time (or substantially the same time) that the 360-degree camera (or other camera) captures the image that includes the smartphone, in response to the predetermined button on the smartphone being pressed, a front-facing camera (e.g., 104a in FIG. 2A) of the smartphone also captures an image that includes the 360-degree camera (or other camera), and both of these images are saved (e.g., in the memory of the smartphone) and the marked position is saved as the location. In accordance with certain embodiments, this image from the front-facing camera is used to increase the reliability and accuracy of the determined spatial information, e.g., by matching the image from the smartphone against the image obtained from the 360-degree camera (or other camera) and calculating a relative pose using computer vision. Such matching can be achieved by detecting features in the two images (one captured by the smartphone, the other captured by the 360-degree camera or other camera), matching the detected features, and then using a robust method to calculate the relative pose between the different cameras. Such a robust method can be a 5-point relative pose solver in a RANSAC (random sample consensus) loop, but is not limited thereto.
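For readers who want a concrete starting point, the sketch below shows one way the described feature matching and 5-point relative pose estimation inside a RANSAC loop could be carried out with OpenCV. It assumes, for simplicity, that both views are perspective images sharing a known intrinsic matrix K (in practice the 360-degree image would first need to be reprojected to a perspective crop); it is offered as an illustration rather than as the implementation of these embodiments.

```python
# Sketch of relative-pose estimation from two overlapping images, assuming
# shared, known intrinsics K.  cv2.findEssentialMat runs a 5-point solver
# inside a RANSAC loop, matching the robust method named in the text.
import cv2
import numpy as np

def relative_pose(img1_gray, img2_gray, K):
    orb = cv2.ORB_create(4000)
    kp1, des1 = orb.detectAndCompute(img1_gray, None)
    kp2, des2 = orb.detectAndCompute(img2_gray, None)

    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des1, des2)

    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

    E, inliers = cv2.findEssentialMat(pts1, pts2, K,
                                      method=cv2.RANSAC,
                                      prob=0.999, threshold=1.0)
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=inliers)
    return R, t   # rotation and unit-scale translation between the cameras
```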

In accordance with certain embodiments, computer vision is used to find the location of the person in the image captured by the 360-degree camera to get a rough spatial location. The images from the smartphone and the 360-degree camera can collectively be used to determine where the smartphone (or tablet computing device) was pointed and the corresponding spatial location.
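One conceivable, and merely illustrative, way to obtain such a rough location of the person is OpenCV's stock HOG-based pedestrian detector, sketched below; the strong distortion of an equirectangular 360-degree image may degrade this particular detector, and the disclosed embodiments should not be assumed to rely on it.

```python
# Illustrative only: rough person localization using OpenCV's built-in
# HOG pedestrian detector.
import cv2

def detect_people(image_bgr):
    """Return rough (x, y, w, h) boxes around people found in the image."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    hog = cv2.HOGDescriptor()
    hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())
    rects, _weights = hog.detectMultiScale(gray, winStride=(8, 8))
    return rects
```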

The items of interest can be, e.g., included in an inspection list or punch list of items that need to be logged and/or repaired. For example, assume a person is renting a rental unit, and that during inspection, the unit gets imaged using a 360-degree camera. Using embodiments of the technology, spatial coordinates of the inspection items or punch list items can be generated to reduce any ambiguities as to what items are damaged and/or need to be repaired. For example, a floor plan can be generated from one or more images captured by the 360-degree camera. The location of the 360-degree camera that captures the images used to generate the 3D representation (e.g., model) can be assumed to be 0, 0 (in just x and y coordinates), or can be 0, 0, 0 (if also including a z coordinate), which is presumably at or near a center of a room. From an electronic compass, or the like, directions such as North, East, South, and West can be determined. Images can be annotated with labels or other metadata to specify what rooms, or the like, they correspond to, such as a kitchen, bathroom, living room, bedroom, etc.

Embodiments of the present technology can be used to add and display markers on a floor plan that indicate one or more items of interest along with notes, such as that the carpet is very worn at a given location. An example of a floor plan 1402 that can be generated from images captured by a 360-degree camera (or other camera) is shown in FIG. 14, wherein the floor plan 1402 is annotated with markers 1404 to show certain items of interest, in accordance with certain embodiments of the present technology. In the example shown in FIG. 14, each of the markers 1404 is numbered (1, 2, 3, 4, 5), with the number of each marker 1404 corresponding to one of the items of interest in the list 1406 shown in the upper right of FIG. 14.

In accordance with certain embodiments, spatial coordinates are added to items of interest. The spatial coordinates can be x and y coordinates relative to a center of a room (or more generally, a location of the 360-degree camera), and can also include a z coordinate for height. In certain embodiments, a coordinate system can be translated into GPS coordinates, e.g., if uploaded to Google Earth, or the like, which can change the spatial coordinates to values of longitude, latitude, and altitude. In certain embodiments, the 0, 0 coordinates can be a corner of a room (or other region) rather than the center, depending on what is chosen by a user. A coordinate system can be adjusted or translated to another coordinate system by adding or subtracting offsets as appropriate. The just described embodiments can be used together with one of the above described embodiments that removes the photographer or other person from an image. For example, a 360-degree camera can be placed in the center of a room, and two images of the room can be captured, where the person is at different locations in the two images. The two captured images of the room can be processed to remove the photographer. Items of interest can then be identified and added to a 3D graphical representation of the room that is generated based on the image(s) of the room. Assume, for example, that a person wants to indicate that a specific door handle on a specific kitchen door needs to be fixed. With an application running on a smartphone or other mobile computing device, the person can indicate that they want to add an inspection item that indicates that this door handle needs to be fixed, and then the person can hold up the smartphone so that the screen of the smartphone can be seen by the 360-degree camera. In accordance with certain embodiments, the person can then press a button on the app/smartphone that causes at least two (and possibly three) things to be performed substantially simultaneously, including: 1) optionally capturing an image of the handle using the rear-facing camera of the smartphone; 2) displaying a recognizable indicator (e.g., the arrow 1208) on the display of the smartphone; and 3) capturing an image of the room (or at least a portion thereof) using the 360-degree camera with the recognizable indicator (e.g., the arrow 1208) on the display of the smartphone and thus included in the captured image. This can result in an image of the broken handle being captured by the smartphone, and also an image of the room (or at least a portion thereof) being captured by the 360-degree camera with the recognizable indicator included in the captured image of the room. In alternative embodiments, these two things need not occur at the same time. This technology enables inspection items and/or other types of items of interest to be automatically identified and spatial coordinates thereof generated using an application. The captured image of the item of interest captured using the rear-facing camera of the smartphone can be used solely for documentation, or can be used to increase the accuracy of the spatial coordinates of the item of interest, e.g., by matching/finding the image captured by the rear-facing camera of the smartphone in the image of the room captured by the 360-degree camera.
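As a concrete illustration of the coordinate translations described earlier in the preceding paragraph, the short sketch below shifts the origin of room coordinates (e.g., from the camera position to a room corner) and makes a rough, small-area conversion of local metric offsets to latitude and longitude around a known reference point. The flat-Earth approximation used here is an assumption adequate only over room-scale distances, not a method prescribed by the disclosure.

```python
# Hedged sketch of coordinate translation: shift the origin by adding or
# subtracting offsets, and approximate a local-metric-to-GPS conversion
# around a known reference latitude/longitude.
import math

def translate_origin(x, y, new_origin_x, new_origin_y):
    """Express (x, y) relative to a new origin, e.g. a chosen room corner."""
    return x - new_origin_x, y - new_origin_y

def local_to_gps(x_east_m, y_north_m, ref_lat_deg, ref_lon_deg):
    """Very rough conversion of metric offsets (meters) to lat/lon degrees."""
    meters_per_deg_lat = 111_320.0
    meters_per_deg_lon = 111_320.0 * math.cos(math.radians(ref_lat_deg))
    return (ref_lat_deg + y_north_m / meters_per_deg_lat,
            ref_lon_deg + x_east_m / meters_per_deg_lon)
```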

Such embodiments of the present technology can be used for other purposes besides marking items that need to be repaired and generating a list of such items. For example, such embodiments can be used to add smart tags within a graphical representation of a region (e.g., room), such that when a smart tag is selected by a user it provides additional information to the user. For example, smart tags can mark various items within a graphical representation of a house that is for sale, so that a potential buyer and/or appraiser can learn more information about such items, such as, but not limited to, appliances, countertops, and/or the like.

In certain embodiments, a house or rental unit can be imaged and smart tags can be added to a 3D graphical representation of the house or rental unit to specify items that should be repaired. Then, after the items are supposedly repaired, the house or rental unit can again be imaged and the new 3D graphical representation of the house or rental unit can be overlaid on the original representation to check whether the items that were supposed to be repaired were actually repaired.

FIG. 13 is a high level flow diagram that is used to summarize embodiments of the present technology introduced above with reference to FIGS. 11 and 12. More specifically, FIG. 13 is used to summarize methods, according to certain embodiments of the present technology, that can be used to determine spatial information for one or more items of interest within a graphical representation of a region generated based on one or more images of the region captured using a camera of a mobile device. Referring to FIG. 13, step 1302 involves capturing one or more images of a region (e.g., room) using a first camera of a first mobile device. The images captured at step 1302 can be captured, e.g., by a 360-degree camera, but are not limited thereto.

Step 1304 involves generating or otherwise obtaining a graphical representation of the region (e.g., room) based on the one or more images of the region captured using the first camera of the first mobile device. Where the graphical representation of the region is three-dimensional, the three-dimensional graphical representation of the region can be generated using structure from motion (SfM) techniques, or any other known or future-developed techniques that can be used to generate a graphical representation of a region based on images of the region. In certain embodiments, the graphical representation of the region can be generated by a second mobile device, e.g., a smartphone or tablet computing device, that receives the images captured at step 1302 via a Wi-Fi network, Bluetooth communication, or some other wireless or wired communication. An application installed on the second mobile device, e.g., a smartphone or tablet computing device, can generate the graphical representation of the region. Alternatively, the second mobile device (or the first mobile device) can use a communication network to transfer the image(s) of the region to a remote system (e.g., 412 in FIG. 4) that generates the graphical representation, which may or may not be three-dimensional, depending upon the implementation.

Still referring to FIG. 13, step 1306 involves, for each item of interest of the one or more items of interest, using the first camera of the first mobile device (e.g., a 360-degree camera) to capture one or more further images of the region while the second mobile device (e.g., a smartphone or tablet computing device) is placed in close proximity to the item of interest, and thus, the second mobile device appears in the one or more further images. Further, step 1308 involves, for each item of interest of the one or more items of interest, determining spatial information (e.g., coordinates) for the item of interest based on the one or more further images of the region within which the second mobile device appears. Step 1310 involves displaying a graphical representation of the region that includes markers for and/or a list of the one or more items of interest, and optionally spatial information for each of the one or more items of interest. FIG. 15 shows an example of a three-dimensional (3D) graphical representation of a region (a wall in this example) that includes a marker 1504 (a 3D pin in this example) specifying precisely where a drywall repair is needed.

As can be appreciated from the above discussion of FIGS. 11 and 12, where the second mobile device is a smartphone or tablet type of mobile computing device, it can include a front side on which is located a display and a front side camera, and a back side on which is located a back side camera. Further, as can be appreciated from the above discussion of FIG. 12, an indicator (e.g., arrow, QR code, etc.) can be displayed on the display of the second mobile device, such that the indicator will be shown in the one or more images of the region captured using the first camera of the first mobile device. The spatial information for an item of interest can be determined by intersecting a ray from a center of the first camera of the first mobile device (e.g., a 360-degree camera) toward the second mobile device (e.g., a smartphone) that appears near the item of interest within the geometry of the space (e.g., a three-dimensional representation of the region). This enables the item of interest (within the graphical representation of the region) to be identified based on the indicator (e.g., arrow, QR code, etc.) on the display of the second mobile device included in the one or more images of the region captured using the first camera of the first mobile device.
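The ray intersection just described can be illustrated with a minimal sketch that assumes the relevant piece of the region's geometry is locally approximated by a plane (for example, a wall extracted from the three-dimensional representation); the helper below is hypothetical and simply intersects the camera-center ray (such as the one produced by the earlier pixel-to-ray sketch) with that plane.

```python
# Minimal sketch, assuming the local geometry is a plane: intersect the ray
# from the camera center with the plane to obtain 3D coordinates for the
# item of interest.
import numpy as np

def intersect_ray_with_plane(ray_dir, plane_point, plane_normal,
                             camera_center=np.zeros(3)):
    """Return the 3D intersection point, or None if the ray misses the plane."""
    denom = np.dot(plane_normal, ray_dir)
    if abs(denom) < 1e-9:
        return None                      # ray is parallel to the plane
    t = np.dot(plane_normal, plane_point - camera_center) / denom
    if t < 0:
        return None                      # intersection is behind the camera
    return camera_center + t * ray_dir
```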

In certain embodiments, the front side camera of the second mobile device (e.g., smartphone) is used to capture a further image that includes the first mobile device, and the further image is used to increase the reliability and/or accuracy of the spatial information determined for the item of interest. Using computer vision, the front side image (e.g., captured using a smartphone) can be matched against the image from the 360-degree camera (or other type of wide FOV camera) to detect feature point matches. These feature point matches can then be used to calculate a relative pose between the two images. This information can be used to enhance the spatial information and make the spatial information more accurate. The relative pose between the two images can be determined using a 5-point relative pose solver in a RANSAC (random sample consensus) loop, but is not limited thereto. The aforementioned items of interest can be, e.g., items that need to be repaired. Alternatively, or additionally, the items of interest can be items for which there is a desire to add smart tags within a 3D graphical representation of a house, rental unit or other geographic region.

A person can be provided with the option of naming an item of interest, e.g., using a touchscreen or other user interface of the second mobile device, right before each instance of step 1306, or right after each instance of step 1306. Alternatively, the user can be provided with the option of naming the various items of interest after step 1308 is performed, and a list of items of interest can be generated between steps 1306 and 1308, or following step 1308, depending upon the specific implementation. Other variations are also possible and within the scope of the embodiments described herein.

In accordance with certain embodiments, various features and functions described herein can be performed under the control of a mobile application that is downloaded to, stored on, and executed by the mobile computing device 102. For example, where the mobile computing device 102 is a smartphone or tablet computing device, various features described herein can be performed under the control of a mobile application, which is also known as a mobile app, or simply an app. Such a mobile application can be available for download from an application store or directly from a software vendor, for free, or for a fee. In accordance with certain embodiments of the present technology, the mobile application controls aspects of both the mobile computing device 102 and the remote camera (e.g., 360-degree camera) with which the mobile computing device communicates (e.g., via a wireless or wired communication link), to thereby cause images and corresponding metadata to be captured and stored for use in producing a 3D representation of a room or other environment, with spatial coordinates and potentially other information about the items of interest made available and accessible.

In much of the discussion above, there was a description of a smartphone type of mobile computing device communicating with and controlling a 360-degree camera. Nevertheless, it is noted that other types of mobile computing devices can be used instead of a smartphone type of mobile computing device. For just one example, a tablet type of mobile computing device can be used instead of a smartphone type of mobile computing device. Further, other types of cameras can be used instead of a 360-degree camera. For example, such alternative cameras can have FOVs that are less than 360 degrees, e.g., 180 degrees or 120 degrees, but are not limited thereto.

The terms “imaging” and “capturing”, as used herein, are used interchangeably, typically to refer to the obtaining or taking of images using a 360-degree camera, another camera, or a camera of a mobile computing device. Further, if a room (or a portion thereof) has already been “imaged” or “captured”, that means images of that room (or a portion thereof) have already been obtained using the 360-degree camera (or other camera). Such images can be stored, e.g., in the JPEG file format, or some alternative file format, such as, but not limited to, Exif, TIFF, RAW, GIF, BMP, PNG, PPM, PAM, or WEBP.

A 3D representation (e.g., model) of a room or other environment can be produced by the mobile computing device 102 based on images of the room or other environment captured by the 360-degree camera 302 (or other camera). Alternatively, the obtained images and metadata corresponding to the images can be uploaded to a remote system (e.g., 412 in FIG. 4) that includes software (e.g., SfM software) and sufficient processing resources to generate a 3D model of a room based on images of the room within a relatively short period of time.

The disclosure has been described in conjunction with various embodiments. However, other variations and modifications to the disclosed embodiments can be understood and effected from a study of the drawings, the disclosure, and the appended claims, and such variations and modifications are to be interpreted as being encompassed by the appended claims.

In the claims, the word “comprising” does not exclude other elements or steps, and the indefinite article “a” or “an” does not exclude a plurality. A single processor or other unit may fulfill the functions of several items recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate, preclude or suggest that a combination of these measures cannot be used to advantage.

A computer program may be stored or distributed on a suitable medium, such as an optical storage medium or a solid-state medium supplied together with, or as part of, other hardware, but may also be distributed in other forms, such as via the Internet or other wired or wireless telecommunication systems.

It is understood that the present subject matter may be embodied in many different forms and should not be construed as being limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete and will fully convey the subject matter to those skilled in the art. Indeed, the subject matter is intended to cover alternatives, modifications and equivalents of these embodiments, which are included within the scope and spirit of the subject matter as defined by the appended claims. Furthermore, in the above detailed description of the present subject matter, numerous specific details are set forth in order to provide a thorough understanding of the present subject matter. However, it will be clear to those of ordinary skill in the art that the present subject matter may be practiced without such specific details.

Aspects of the present disclosure are described herein with reference to flow diagrams and/or block diagrams of methods, apparatuses (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flow diagrams (e.g., in FIGS. 6, 7 and 9) and/or many of the blocks in the block diagrams (e.g., in FIGS. 1, 3 and 4), and combinations of blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to one or more processors of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor(s) of the computer or other programmable instruction execution apparatus, create a mechanism for implementing the functions/acts specified in the flow diagram and/or block diagram block or blocks.

The computer-readable non-transitory media includes all types of computer readable media, including magnetic storage media, optical storage media, and solid state storage media, and specifically excludes signals. It should be understood that the software can be installed in and sold with the device. Alternatively, the software can be obtained and loaded into the device, including obtaining the software via a disc medium or from any manner of network or distribution system, including, for example, from a server owned by the software creator or from a server not owned but used by the software creator. The software can be stored on a server for distribution over the Internet, for example.

Computer-readable storage media (medium) exclude (excludes) propagated signals per se, can be accessed by a computer and/or processor(s), and include volatile and non-volatile internal and/or external media that is removable and/or non-removable. For the computer, the various types of storage media accommodate the storage of data in any suitable digital format. It should be appreciated by those skilled in the art that other types of computer readable medium can be employed, such as zip drives, solid state drives, magnetic tape, flash memory cards, flash drives, cartridges, and the like, for storing computer executable instructions for performing the novel methods (acts) of the disclosed architecture.

For purposes of this document, it should be noted that the dimensions of the various features depicted in the figures may not necessarily be drawn to scale.

For purposes of this document, reference in the specification to “an embodiment,” “one embodiment,” “some embodiments,” or “another embodiment” may be used to describe different embodiments or the same embodiment.

For purposes of this document, a connection may be a direct connection or an indirect connection (e.g., via one or more other parts). In some cases, when an element is referred to as being connected or coupled to another element, the element may be directly connected to the other element or indirectly connected to the other element via intervening elements. When an element is referred to as being directly connected to another element, then there are no intervening elements between the element and the other element. Two devices are “in communication” if they are directly or indirectly connected so that they can communicate electronic signals between them.

For purposes of this document, the term “based on” may be read as “based at least in part on.”

For purposes of this document, without additional context, use of numerical terms such as a “first” object, a “second” object, and a “third” object may not imply an ordering of objects, but may instead be used for identification purposes to identify different objects. Similarly, a “first” user, a “second” user, and a “third” user may not imply an ordering of users, but may instead be used for identification purposes to identify different users.

For purposes of this document, the term “set” of objects may refer to a “set” of one or more of the objects.

The foregoing detailed description has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the subject matter claimed herein to the precise form(s) disclosed. Many modifications and variations are possible in light of the above teachings. The described embodiments were chosen in order to best explain the principles of the disclosed technology and its practical application to thereby enable others skilled in the art to best utilize the technology in various embodiments and with various modifications as are suited to the particular use contemplated. It is intended that the scope be defined by the claims appended hereto.

The previous description of the preferred embodiments is provided to enable any person skilled in the art to make or use the embodiments of the present invention. While the invention has been particularly shown and described with reference to preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention.

What is claimed:
1. A method for use with a camera having a field of view (FOV), the method comprising: obtaining a first image (A) of a scene within the FOV of the camera while a person is at a first location within the FOV of the camera, and thus, the person appears in a first portion of the first image (A); obtaining a second image (B) of the scene within the FOV of the camera while the person is at a second location within the FOV of the camera that differs from the first location, and thus, the person appears in a second portion of the second image (B) that differs from the first portion of the first image (A); and generating, based on the first and second images (A and B), a third image (C) of the scene, such that the third image (C) of the scene is devoid of the person and includes portions of the scene that were blocked by the person in the first and second images (A and B), wherein the generating is performed using one or more processors.
2. The method of claim 1, wherein the generating the third image (C) of the scene comprises: using computer vision to identify the person within each of the first and second images (A and B); and combining a portion of the first image (A) that is devoid of the person with a portion of the second image (B) that is devoid of the person to produce the third image (C) of the scene that is devoid of the person and includes the portions of the scene that were blocked by the person in the first and second images.
3. The method of claim 1, wherein the generating the third image (C) of the scene comprises: identifying first and second portions (A1, A2) of the first image (A) that differ from the second image (B); identifying first and second portions (B1, B2) of the second image (B) that differ from the first image (A); determining a first metric of similarity (a1) indicative of similarity between the first portion (A1) of the first image (A) that differs from the second image (B) and a remaining portion of the first image (A); determining a second metric of similarity (a2) indicative of similarity between the second portion (A2) of the first image (A) that differs from the second image (B) and a remaining portion of the first image (A); determining a third metric of similarity (b1) indicative of similarity between the first portion (B1) of the second image (B) that differs from the first image (A) and a remaining portion of the second image (B); determining a fourth metric of similarity (b2) indicative of similarity between the second portion (B2) of the second image (B) that differs from the first image (A) and a remaining portion of the first image (A); and determining, based on the first, second, third, and fourth metrics of similarity (a1, a2, b1, b2), which one of the first portion (A1) of the first image (A) and the first portion (B1) of the second image (B) is to be included in the third image (C), and which one of the second portion (A2) of the first image (A) and the second portion (B2) of the second image (B) is to be included in the third image (C).
4. The method of claim 3, wherein the determining, based on the first, second, third, and fourth metrics of similarity (a1, a2, b1, b2), which one of the first portion (A1) of the first image (A) and the first portion (B1) of the second image (B) is to be included in the third image (C), and which one of the second portion (A2) of the first image (A) and the second portion (B2) of the second image (B) is to be included in the third image (C), comprises: comparing a sum of the first and fourth metrics (a1+b2) to a sum of the second and third metrics (a2+b1); and determining, based on results of the comparing, which one of the first portion (A1) of the first image (A) and the first portion (B1) of the second image (B) is to be included in the third image (C), and which one of the second portion (A2) of the first image (A) and the second portion (B2) of the second image (B) is to be included in the third image (C).
5. The method of claim 4, wherein the comparing the sum of the first and fourth metrics (a1+b2) to the sum of the second and third metrics (a2+b1) comprises determining whether or not the sum of the first and fourth metrics (a1+b2) is less than the sum of the second and third metrics (a2+b1).
6. The method of claim 4, wherein: for each of the first, second, third, and fourth metrics of similarity (a1, a2, b1, b2), a lower magnitude is indicative of higher similarity, and a higher magnitude is indicative of a lower similarity; the comparing comprises determining whether the sum of the first and fourth metrics (a1+b2) is less than or greater than the sum of the second and third metrics (a2+b1); and in response to determining that the sum of the first and fourth metrics (a1+b2) is less than the sum of the second and third metrics (a2+b1), determining that the first portion (A1) of the first image (A) and the second portion (B2) of the second image (B) are to be included in the third image (C); and in response to determining that the sum of the first and fourth metrics (a1+b2) is greater than the sum of the second and third metrics (a2+b1), determining that the second portion (A2) of the first image (A) and the first portion (B1) of the second image (B) are to be included in the third image (C).
7. The method of claim 4, wherein: for each of the first, second, third, and fourth metrics of similarity (a1, a2, b1, b2), a lower magnitude is indicative of lower similarity, and a higher magnitude is indicative of a higher similarity; the comparing comprises determining whether the sum of the first and fourth metrics (a1+b2) is less than or greater than the sum of the second and third metrics (a2+b1); and in response to determining that the sum of the first and fourth metrics (a1+b2) is greater than the sum of the second and third metrics (a2+b1), determining that the first portion (A1) of the first image (A) and the second portion (B2) of the second image (B) are to be included in the third image (C); and in response to determining that the sum of the first and fourth metrics (a1+b2) is less than the sum of the second and third metrics (a2+b1), determining that the second portion (A2) of the first image (A) and the first portion (B1) of the second image (B) are to be included in the third image (C).
8. The method of claim 1, wherein the camera has a 360-degree FOV.
9. The method of claim 1, wherein the first and second images (A and B) are captured using a 360-degree camera that is being controlled by a mobile computing device that is in wireless communication with the 360-degree camera, and the mobile computing device comprises one of a smartphone or a tablet type of mobile computing device.
10. One or more processor readable storage devices having instructions encoded thereon which when executed cause one or more processors to perform a method for use with a camera having a field of view (FOV), the method comprising: obtaining a first image (A) of a scene within the FOV of the camera while a person is at a first location within the FOV of the camera, and thus, the person appears in a first portion of the first image (A); obtaining a second image (B) of the scene within the FOV of the camera while the person is at a second location within the FOV of the camera that differs from the first location, and thus, the person appears in a second portion of the second image (B) that differs from the first portion of the first image (A); and generating, based on the first and second images (A and B), a third image (C) of the scene, such that the third image (C) of the scene is devoid of the person and includes portions of the scene that were blocked by the person in the first and second images (A and B).
11. The one or more processor readable storage devices of claim 10, wherein the generating the third image (C) of the scene comprises: using computer vision to identify the person within each of the first and second images (A and B); and combining a portion of the first image (A) that is devoid of the person with a portion of the second image (B) that is devoid of the person to produce the third image (C) of the scene that is devoid of the person and includes the portions of the scene that were blocked by the person in the first and second images.
12. The one or more processor readable storage devices of claim 10, wherein the generating the third image (C) of the scene comprises: identifying first and second portions (A1, A2) of the first image (A) that differ from the second image (B); identifying first and second portions (B1, B2) of the second image (B) that differ from the first image (A); determining a first metric of similarity (a1) indicative of similarity between the first portion (A1) of the first image (A) that differs from the second image (B) and a remaining portion of the first image (A); determining a second metric of similarity (a2) indicative of similarity between the second portion (A2) of the first image (A) that differs from the second image (B) and a remaining portion of the first image (A); determining a third metric of similarity (b1) indicative of similarity between the first portion (B1) of the second image (B) that differs from the first image (A) and a remaining portion of the second image (B); determining a fourth metric of similarity (b2) indicative of similarity between the second portion (B2) of the second image (B) that differs from the first image (A) and a remaining portion of the first image (A); and determining, based on the first, second, third, and fourth metrics of similarity (a1, a2, b1, b2), which one of the first portion (A1) of the first image (A) and the first portion (B1) of the second image (B) is to be included in the third image (C), and which one of the second portion (A2) of the first image (A) and the second portion (B2) of the second image (B) is to be included in the third image (C).
13. The one or more processor readable storage devices of claim 12, wherein the determining, based on the first, second, third, and fourth metrics of similarity (a1, a2, b1, b2), which one of the first portion (A1) of the first image (A) and the first portion (B1) of the second image (B) is to be included in the third image (C), and which one of the second portion (A2) of the first image (A) and the second portion (B2) of the second image (B) is to be included in the third image (C), comprises: comparing a sum of the first and fourth metrics (a1+b2) to a sum of the second and third metrics (a2+b1); and determining, based on results of the comparing, which one of the first portion (A1) of the first image (A) and the first portion (B1) of the second image (B) is to be included in the third image (C), and which one of the second portion (A2) of the first image (A) and the second portion (B2) of the second image (B) is to be included in the third image (C).
14. The one or more processor readable storage devices of claim 13, wherein the comparing the sum of the first and fourth metrics (a1+b2) to the sum of the second and third metrics (a2+b1) comprises determining whether or not the sum of the first and fourth metrics (a1+b2) is less than the sum of the second and third metrics (a2+b1).
15. The one or more processor readable storage devices of claim 14, wherein: for each of the first, second, third, and fourth metrics of similarity (a1, a2, b1, b2), a lower magnitude is indicative of higher similarity, and a higher magnitude is indicative of a lower similarity; the comparing comprises determining whether the sum of the first and fourth metrics (a1+b2) is less than or greater than the sum of the second and third metrics (a2+b1); and in response to determining that the sum of the first and fourth metrics (a1+b2) is less than the sum of the second and third metrics (a2+b1), determining that the first portion (A1) of the first image (A) and the second portion (B2) of the second image (B) are to be included in the third image (C); and in response to determining that the sum of the first and fourth metrics (a1+b2) is greater than the sum of the second and third metrics (a2+b1), determining that the second portion (A2) of the first image (A) and the first portion (B1) of the second image (B) are to be included in the third image (C).
16. The one or more processor readable storage devices of claim 14, wherein: for each of the first, second, third, and fourth metrics of similarity (a1, a2, b1, b2), a lower magnitude is indicative of lower similarity, and a higher magnitude is indicative of a higher similarity; the comparing comprises determining whether the sum of the first and fourth metrics (a1+b2) is less than or greater than the sum of the second and third metrics (a2+b1); and in response to determining that the sum of the first and fourth metrics (a1+b2) is greater than the sum of the second and third metrics (a2+b1), determining that the first portion (A1) of the first image (A) and the second portion (B2) of the second image (B) are to be included in the third image (C); and in response to determining that the sum of the first and fourth metrics (a1+b2) is less than the sum of the second and third metrics (a2+b1), determining that the second portion (A2) of the first image (A) and the first portion (B1) of the second image (B) are to be included in the third image (C).
17. A system for use with a camera having a field of view (FOV), the system comprising one or more processors configured to: obtain a first image (A) of a scene within the FOV of the camera while a person is at a first location within the FOV of the camera, and thus, the person appears in a first portion of the first image (A); obtain a second image (B) of the scene within the FOV of the camera while the person is at a second location within the FOV of the camera that differs from the first location, and thus, the person appears in a second portion of the second image (B) that differs from the first portion of the first image (A); and generate, based on the first and second images (A and B), a third image (C) of the scene, such that the third image (C) of the scene is devoid of the person and includes portions of the scene that were blocked by the person in the first and second images (A and B).
18. The system of claim 17, wherein the one or more processors is/are configured to generate the third image (C) of the scene by: using computer vision to identify the person within each of the first and second images (A and B); and combining a portion of the first image (A) that is devoid of the person with a portion of the second image (B) that is devoid of the person to produce the third image (C) of the scene that is devoid of the person and includes the portions of the scene that were blocked by the person in the first and second images.
19. The system of claim 17, wherein the one or more processors is/are configured to generate the third image (C) of the scene by: identifying first and second portions (A1, A2) of the first image (A) that differ from the second image (B); identifying first and second portions (B1, B2) of the second image (B) that differ from the first image (A); determining a first metric of similarity (a1) indicative of similarity between the first portion (A1) of the first image (A) that differs from the second image (B) and a remaining portion of the first image (A); determining a second metric of similarity (a2) indicative of similarity between the second portion (A2) of the first image (A) that differs from the second image (B) and a remaining portion of the first image (A); determining a third metric of similarity (b1) indicative of similarity between the first portion (B1) of the second image (B) that differs from the first image (A) and a remaining portion of the second image (B); determining a fourth metric of similarity (b2) indicative of similarity between the second portion (B2) of the second image (B) that differs from the first image (A) and a remaining portion of the first image (A); and determining, based on the first, second, third, and fourth metrics of similarity (a1, a2, b1, b2), which one of the first portion (A1) of the first image (A) and the first portion (B1) of the second image (B) is to be included in the third image (C), and which one of the second portion (A2) of the first image (A) and the second portion (B2) of the second image (B) is to be included in the third image (C).
20. The system of claim 17, wherein the one or more processors is/are further configured to use the third image (C) of the scene, which is devoid of the person, to generate a schematic, blueprint or other graphical representation of a room.